Semi-Automatic Entity Set Refinement

Authors

Vishnu Vyas

Patrick Pantel

Abstract

State-of-the-art set expansion algorithms produce expansions of varying quality for different entity types. Even for the highest-quality expansions, errors still occur, and manual refinement is necessary for most practical uses. In this paper, we propose algorithms to aid this refinement process, greatly reducing the amount of manual labor required. The methods rely on the observation that most expansion errors are systematic, often stemming from ambiguous seed elements. Empirical evidence shows that, with our methods, average R-precision over random entity sets improves by 26% to 51% when given 5 to 10 manually tagged errors. Both proposed refinement models have time complexity linear in the set size, allowing for practical online use in set expansion systems.
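The abstract reports its gains in R-precision. As a reminder of what that metric measures, here is a minimal sketch using the standard definition: precision over the top R ranked items, where R is the number of correct items. The function name `r_precision` and the toy entity lists are hypothetical, chosen for illustration only. The sketch does not implement the paper's refinement models and only shows how a ranking with fewer errors near the top scores higher.

```python
def r_precision(ranked, relevant):
    """Precision over the top R items, where R = number of relevant items."""
    relevant = set(relevant)
    r = len(relevant)
    if r == 0:
        return 0.0  # convention chosen here for an empty gold set (assumption)
    hits = sum(1 for item in ranked[:r] if item in relevant)
    return hits / r


# Hypothetical gold set and rankings (not data from the paper).
gold = {"mars", "venus", "jupiter", "saturn"}

# An expansion polluted by an ambiguous seed ("mars" as planet vs. candy bar).
before = ["mars", "snickers", "venus", "twix", "jupiter", "saturn"]

# The same expansion after the erroneous items have been removed.
after = ["mars", "venus", "jupiter", "saturn"]

print(r_precision(before, gold))  # 0.5
print(r_precision(after, gold))   # 1.0
```

In the toy data, the two candy-bar entries stand in for the kind of systematic error the abstract attributes to ambiguous seeds.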

Related resources

Fine-grained Entity Set Refinement with User Feedback

State of the art semi-supervised entity set expansion algorithms produce noisy results, which need to be refined manually. Sets expanded for intended fine-grained concepts are especially noisy because these concepts are not well represented by the limited number of seeds. Such sets are usually incorrectly expanded to contain elements of a more general concept. We show that fine-grained control ...

Concept Detector Refinement on Social Videos

The explosion of social video sharing sites poses new challenges for video search and indexing techniques. Because of the concept diversity in social videos, it is very hard to build a well-annotated dataset that provides good coverage over the whole meaning of concepts. However, the prosperity of social video also makes it easy to obtain a huge number of videos, which gives an opportunity to ...

Heuristics on the Definition of UML Refinement Patterns

In this article, we present a strategy to formalize frequently occurring forms of refinement that take place in UML model construction. This strategy consists of recognizing a set of well-founded refinement structures in a formal language, which are then embedded into a UML-based development, giving rise to a set of UML refinement patterns. Apart from providing semi-formal evidence on the prese...

EASEAndroid: Automatic Policy Analysis and Refinement for Security Enhanced Android via Large-Scale Semi-Supervised Learning

Mandatory protection systems such as SELinux and SEAndroid harden operating system integrity. Unfortunately, policy development is error prone and requires lengthy refinement using audit logs from deployed systems. While prior work has studied SELinux policy in detail, SEAndroid is relatively new and has received little attention. SEAndroid policy engineering differs significantly from SELinux:...
